NanoChat d32 Flash News List | Blockchain.News

List of Flash News about NanoChat d32

2025-10-24 15:35
Karpathy Unveils SpellingBee for nanochat d32: Step-by-Step SFT/RL Finetuning Guide to Add Letter-Counting Capability and Its AI-Token Implications

According to @karpathy, he released a full guide showing how a new synthetic task called SpellingBee teaches nanochat d32 to count letters in words such as 'strawberry' by generating user-assistant training pairs and mixing them into midtraining or SFT fine-tuning, with optional RL to improve robustness (Source: Karpathy X post dated Oct 24, 2025; GitHub nanochat discussion 164). The method stresses diverse user prompts; careful tokenization and whitespace handling; breaking the reasoning into many small token-level steps by standardizing the word, spelling it out, and iterating with an explicit counter; and encouraging two solution paths, manual reasoning and Python tool use (Source: Karpathy X post dated Oct 24, 2025; GitHub nanochat discussion 164). Karpathy notes that because nanochat d32 is small, the capability is encouraged by over-representing examples in the dataset, and reliability can be further improved by simulating mistakes in the data or running RL (Source: Karpathy X post dated Oct 24, 2025; GitHub nanochat discussion 164). For traders, open-source progress on small-LLM tooling has coincided with episodic attention flows to AI-linked crypto assets such as RNDR, FET, and AGIX around major AI catalysts, with Kaiko reporting AI-token rallies around Nvidia earnings in 2024 (Source: Kaiko Research 2024 weekly market reports; Nvidia 2024 earnings releases). No token or product launch is included here; this is a technical training guide and example set for injecting a capability into a small LLM (Source: Karpathy X post dated Oct 24, 2025; GitHub nanochat discussion 164).
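
To make the described data format concrete, below is a minimal Python sketch of a SpellingBee-style example generator. It is an illustrative reconstruction based on the description above, not Karpathy's released code: the prompt templates, step wording, and function names are assumptions, and only the manual-reasoning path is shown, whereas a real dataset would also include the Python tool-use variant.

    # Minimal sketch of a SpellingBee-style synthetic example generator.
    # Illustrative reconstruction only; templates and formatting are
    # assumptions, not Karpathy's actual script.
    import random

    PROMPT_TEMPLATES = [
        "How many times does the letter {c} appear in {w}?",
        "Count the {c}'s in the word {w}.",
        "In '{w}', how many '{c}' are there?",
    ]

    def make_example(word: str, letter: str) -> dict:
        # Diverse user phrasings guard against the model keying on one template.
        user = random.choice(PROMPT_TEMPLATES).format(c=letter, w=word)

        # Break the reasoning into many small steps: standardize the word,
        # spell it out, then iterate with an explicit counter.
        w = word.strip().lower()
        steps = [f"Standardized word: {w}", "Spelling it out: " + "-".join(w)]
        count = 0
        for i, ch in enumerate(w, 1):
            if ch == letter:
                count += 1
                steps.append(f"position {i}: '{ch}' matches, counter = {count}")
        steps.append(f"Answer: {count}")

        return {"user": user, "assistant": "\n".join(steps)}

    print(make_example("strawberry", "r"))

Sampling the prompt template at random is one simple way to add the entropy the guide calls for, so the model learns the skill rather than a single phrasing.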

2025-10-21 15:59
Andrej Karpathy Gives nanochat d32 a Custom Identity via Synthetic Data, Releases Example Script: Key Signals for AI Agent Builders

According to @karpathy, nanochat now carries a defined identity and can state facts about itself, including that it is nanochat d32, that it was built by him at a reported cost of $800, and that its non-English proficiency is weaker, with the identity instilled via synthetic data generation (Source: x.com/karpathy/status/1980508380860150038). He released an example script that demonstrates generating diverse synthetic conversations and mixing them into midtraining or SFT, stressing the importance of entropy to avoid repetitive datasets (Source: x.com/karpathy/status/1980508380860150038). He adds that base LLMs lack inherent personality or self-knowledge and need such traits explicitly bolted on via curated synthetic data (Source: x.com/karpathy/status/1980508380860150038). For traders, the disclosed $800 cost benchmark and open-source workflow provide concrete cost and process reference points for evaluating open-source AI agent development and adoption paths across AI-linked tokens and AI-exposed equities (Source: twitter.com/karpathy/status/1980665134415802554).
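
As a sketch of how such identity data might be generated, the snippet below samples questions and answer facts independently so the resulting conversations vary rather than repeating one memorized exchange. The question templates, fact strings, and sampling scheme are assumptions for illustration that paraphrase the post; this is not the released script.

    # Illustrative sketch of identity injection via synthetic conversations.
    # Facts and templates below are assumptions paraphrasing the post.
    import random

    FACTS = [
        "I'm nanochat d32, a small language model built by Andrej Karpathy.",
        "My training run cost roughly $800.",
        "I'm weaker in languages other than English.",
    ]

    QUESTIONS = [
        "Who are you?",
        "Who built you and what did it cost?",
        "Can you speak other languages?",
        "Tell me about yourself.",
    ]

    def make_identity_conversation(rng: random.Random) -> dict:
        # Sample the question and a variable-size subset of facts
        # independently, keeping dataset entropy high.
        question = rng.choice(QUESTIONS)
        k = rng.randint(1, len(FACTS))
        answer = " ".join(rng.sample(FACTS, k))
        return {"user": question, "assistant": answer}

    rng = random.Random(0)
    dataset = [make_identity_conversation(rng) for _ in range(1000)]

Conversations like these would then be over-represented in the midtraining or SFT mix so the small model reliably absorbs the bolted-on traits.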

2025-10-16 00:14
Karpathy Unveils $1,000 nanochat d32: 33-Hour Training Run, CORE 0.31, GSM8K 20%; Watch AI Compute Tokens RNDR, AKT, TAO

According to @karpathy, the depth-32 nanochat d32 trained for about 33 hours at a cost of roughly $1,000 and showed consistent metric gains across pretraining, SFT, and RL (Source: Karpathy on X; Karpathy GitHub nanochat discussion). He reports a CORE score of 0.31 versus about 0.26 for GPT-2 and a GSM8K improvement from around 8% to about 20%, a notable uplift for a micro model (Source: Karpathy on X; Karpathy GitHub nanochat discussion). He cautions that nanochat costs $100–$1,000 to train and that the $100 version is about 1/1000th the size of GPT-3, leading to frequent hallucinations and limited reliability compared to frontier LLMs, so user expectations should remain modest (Source: Karpathy on X). He adds that scripts including run1000.sh are available in the repo, that he is temporarily hosting the model for testing, and that he plans throughput tuning before possibly scaling to a larger tier (Source: Karpathy on X; Karpathy GitHub repository). For traders, decentralized GPU networks that market AI workload support, such as Render (RNDR), Akash (AKT), and Bittensor (TAO), remain key watchlist names as open-source, low-cost training expands developer experimentation (Source: Render Network documentation; Akash Network documentation; Bittensor documentation).
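
For quick context, a back-of-the-envelope check of the reported figures; the implied hourly compute rate is an inference from the quoted totals, not a number from the post, and the GSM8K baseline is an earlier nanochat checkpoint rather than GPT-2:

    # Sanity arithmetic on the reported d32 figures (from the post).
    # CORE baseline = GPT-2 (~0.26); GSM8K baseline = earlier checkpoint (~8%).
    metrics = {"CORE": (0.26, 0.31), "GSM8K": (0.08, 0.20)}  # (baseline, d32)
    for name, (base, d32) in metrics.items():
        print(f"{name}: {base:.2f} -> {d32:.2f} ({d32 / base:.2f}x)")

    hours, cost_usd = 33, 1_000
    print(f"Implied compute rate: ${cost_usd / hours:.2f}/hour")  # ~$30/hour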
